home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
Language/OS - Multiplatform Resource Library
/
LANGUAGE OS.iso
/
cocktail
/
cocktail.txt
< prev
next >
Wrap
Text File
|
1993-07-24
|
20KB
|
469 lines
*************************************************************************
* *
* Compiler Construction Tool Box *
* ============================== *
* *
* Version 9209 *
* *
* Copyright (c) 1989, 1990, 1991, 1992 by *
* *
* Gesellschaft fuer Mathematik und Datenverarbeitung *
* (German National Research Center for Computer Science) *
* Forschungsstelle fuer Programmstrukturen *
* an der Universitaet Karlsruhe *
* *
* All rights reserved. GMD assumes no responsibility for the use *
* or reliability of its software. *
* *
*************************************************************************
Direct requests, comments, questions, and error reports to:
Josef Grosch
GMD Forschungsstelle
Vincenz-Priessnitz-Str. 1
D-7500 Karlsruhe 1
Phone: +721-662226
Email: grosch@karlsruhe.gmd.de
Distribution Format:
--------------------
The compiler construction tool box is available via anonymous ftp from
several file servers:
host number directory
ftp.karlsruhe.gmd.de 192.76.241.33 /pub/cocktail
ftp.gmd.de 129.26.8.90 /gmd/cocktail
iraun1.ira.uka.de 129.13.10.90 /pub/programming/cocktail
ftp.rus.uni-stuttgart.de 129.69.1.12 /soft/unix/programming/compilerbau
ftp.th-darmstadt.de 130.83.55.75 /pub/programming/languages/compiler-compiler/cocktail
ftp.eb.ele.tue.nl /pub/src/cocktail
src.doc.ic.ac.uk 146.169.2.1 /languages/tools/gmd
gatekeeper.dec.com /.3/plan/gmd
The files or file trees are compressed: For UNIX tar and compress are used
(suffix: .tar.Z), for MSDOS zip is used (suffix: .zip).
To uncompress a file tree use:
uncompress < tool.tar.Z | tar xvf - or unzip tool.zip
Alternatively, the toolbox is distributed on tape in uncompressed tar format
on the following media:
- DC300/600 data cartridge (streamer tape)
- TK 50
- Exabyte
- 1/2" magnetic tape (1600 bpi)
To read a tape use:
tar -xvfb /dev/rst0 20 or tar -xvb 20 or similar commands
The directories and their contents are as follows:
--------------------------------------------------
directory contents
------------------------------------------------------------------------
README this file
Makefile compilation, installation, and test of the tools (UNIX)
compile.bat compilation of the tools (MSDOS)
install.bat installation of the tools (MSDOS)
test.bat test of the tools (MSDOS)
doc.ps documentation in postscript format
doc.me documentation in troff format, me macros
doc.doc documentation in ascii format (without pictures)
man manual pages in troff format, man macros
rex Scanner Generator
lalr LALR(1) Parser Generator
ell LL(1) Recursive Descent Parser Generator
bnf Transforms Grammars from Extended BNF to Plain BNF
front Common Front-End of Lalr, Ell, and Bnf
reuse Library of Reusable Modules (needed for all programs)
common Library for estra and ell
specs Example Specifications for the Above Tools
cg Common Program implementing Ast and Ag
Ast = Generator for Abstract Syntax Trees
Ag = Attribute Evaluator Generator
puma Transformation Tool based on Pattern Matching
l2r Transforms Lex input to Rex input
y2l Transforms Yacc input to Lalr input
r2l Transforms Rex input to Lex input
rpp Rex PreProcessor: rpp + cg extract most of a scanner
specification out of a parser specification
estra Transformation of attributed trees (prototype)
hexa contains the scanner and parser tables of Rex and Front
(= front-end of Lalr and Bnf) converted from binary to
ascii hexadecimal representation
dos directory containg the MSDOS version
bin UNIX: shell scripts (my version), MSDOS: batch scripts
lib executables, table and data files (for SUN 3/SunOS 4.0 or PC/MSDOS)
(mtc Modula-2 to C translator)
The names of the subdirectories indicate the following types of information:
sub directory contents
------------------------------------------------------------------------
src source files in Modula-2
m2c source files in C (generated from the Modula-2 sources)
src source files in C (generated from the C sources for MSDOS)
c source files in C (hand-written)
lib data files, module skeletons
test test environment for a tool
Documentation:
--------------
The directories doc.ps, doc.me, and doc.doc contain documentation in postscript
format, troff format (me macros), and in ASCII format (without pictures).
The documentation for UNIX and MSDOS is the same. Therefore the documents are
stored in the UNIX directory, only, they are not repeated in the 'dos' subdirectory.
The document entitled "Toolbox Introduction" in the files intro.ps, intro.me,
or intro.doc gives an overview and introduces into the toolbox. It should be
read first. The following documents are available:
Filename Title
------------------------------------------------------------------------
intro Toolbox Introduction
toolbox A Tool Box for Compiler Construction
werkzeuge Werkzeuge fu"r den U"bersetzerbau
reuse Reusable Software - A Collection of Modula-2-Modules
prepro Preprocessors
rex Rex - A Scanner Generator
scanex Selected Examples of Scanner Specifications
scangen Efficient Generation of Table-Driven Scanners
lalr-ell The Parser Generators Lalr and Ell
lalr Lalr - A Generator for Efficient Parsers
ell Efficient and Comfortable Error Recovery in Recursive
Descent Parsers
highspeed Generators for High-Speed Front-Ends
autogen Automatische Generierung effizienter Compiler
ast Ast - A Generator for Abstract Syntax Trees
toolsupp Tool Support for Data Structures
ag Ag - An Attribute Evaluator Generator
ooags Object-Oriented Attribute Grammars
estra Spezifikation und Implementierung der Transformation
attributierter Ba"ume
puma Puma - A Generator for the Transformation of Attributed Trees
trafo Transformation of Attributed Trees Using Pattern Matching
(minilax Specification of a MiniLAX-Interpreter)
(begmanual BEG - a Back End Generator - User Manual)
References:
-----------
1. J. Grosch, `Generators for High-Speed Front-Ends', LNCS,
371, 81-92 (Oct. 1988), Springer Verlag.
2. H. Emmelmann, F. W. Schroeer, Rudolf Landwehr, ` BEG - a Generator
for Efficient Back Ends', Sigplan Notices, 24, 227-237 (Jul. 1989)
3. W. M. Waite, J. Grosch and F. W. Schroeer, `Three Compiler
Specifications', GMD-Studie Nr. 166, GMD Forschungsstelle an
der Universitaet Karlsruhe, Aug. 1989.
4. J. Grosch, `Efficient Generation of Lexical Analysers',
Software-Practice & Experience, 19, 1089-1103 (Nov. 1989).
5. J. Grosch, `Efficient and Comfortable Error Recovery in Recursive
Descent Parsers', Structured Programming, 11, 129-140 (1990).
6. J. Grosch, H. Emmelmann, `A Tool Box for Compiler Construction',
LNCS, 477, 106-116 (Oct. 1990), Springer Verlag.
7. J. Grosch, `Object-Oriented Attribute Grammars', in: Proceedings of the
Fifth International Symposium on Computer and Information Sciences (ISCIS V)
(Eds. A. E. Harmanci, E. Gelenbe), Cappadocia, Nevsehir, Turkey, 807-816,
(Oct. 1990).
8. J. Grosch, `Lalr - a Generator for Efficient Parsers',
Software-Practice & Experience, 20, 1115-1135 (Nov. 1990).
9. J. Grosch, `Tool Support for Data Structures',
Structured Programming, 12, 31-38 (1991).
10. J. Grosch, `Transformation of Attributed Trees Using Pattern Matching',
to appear (1992).
Machine Dependencies:
---------------------
All machine dependent code is isolated in the file System.c which is written
in C. This file is set up to work under UNIX and MSDOS. There are three
copies of this file in the following directories:
reuse/c
reuse/src
reuse/m2c
The UNIX command 'install' is used during installation. Unfortunately, this
command is not as standard as it should be. If 'install' is missing on your
machine or it complains about the calls then the shell script in the file
hexa/install can simulate the desired behaviour.
Installation under UNIX:
------------------------
The Makefile at the global level controls the compilation and installation
of the individual tools. It activates the tool specific Makefiles.
Several tools use binary data files called Scan.Tab and Pars.Tab whose internal
representation depends on whether the machine is little-endian or big-endian.
All machines store integer numbers in a sequence of bytes with increasing adresses.
An integer number usually consists of four bytes where the bytes at both ends are
termed most significat byte (MSB) and least significant byte (LSB). Big-endian
machines store the MSB at the lowest address and the LSB at the highest address.
Little-endian machines store the bytes the other way round.
Big-endian are e. g. MC 680x0, SUN/3, SUN/4, SPARC,
little-endian are e. g. VAX, DEC Station, 80386, 80486.
Initially the binary data files are configured for big-endian machines.
To find out in which class your machine is in execute:
make endian
To convert the binary files from big-endian to little-endian or vice versa execute:
make bin.conv
Edit the first couple of lines in the Makefile to accomodate your needs.
To compile the programs execute:
make
To install the programs execute:
make install
Installation under MSDOS:
-------------------------
Requirements to run the binaries:
- 80386 or 80486 CPU
- at least 2 MB memory
- DOS extender (go32, from DJ Delorie)
Requirements to compile the sources:
- GNU C compiler (gcc, DJ Delorie's GCC port to DOS)
- GNU make (called gmake on our machine)
- ar and ranlib (archive handling)
The software of DJ Delorie's GCC port to DOS is available from:
host: grape.ecs.clarkson.edu
login: ftp
directory: /pub/djgcc
Installation from the sources:
There are several batch files at the global level for compilation and installation
of the individual tools. These activate either tool specific batch files or
GNU make (gmake) which is controlled by tool specific Makefiles.
Define four environment variables describing two directories that will store the
batch files that control the tools and the binaries and data files of the tools.
UBIN and ULIB hold the same names as BIN and LIB but use the UNIX style
(/ instead of \). For example:
set BIN=\usr\hacker\bin
set LIB=\usr\hacker\lib
set UBIN=/usr/hacker/bin
set ULIB=/usr/hacker/lib
To compile the programs execute:
compile.bat
To install the programs execute:
install.bat
To test the programs execute:
test.bat
Installation of the binaries:
Copy the directories bin and lib to an appropriate location and set up the
four environement variables as described above. That's it. To test the programs
execute:
test.bat
Recent Changes
--------------
Version 9209:
- Port to MSDOS.
Version 9208:
- The scanner generator 'rex' and the parser generators 'lalr' and 'ell'
allow to chose arbitrary names for the generated modules. Therefore, it is
possible to have several scanners and parsers in one program.
- The length of a token and the lookahead in scanners generated by 'rex' is no
longer restricted to 256 characters. Both, tokens and lookahead can be of
arbitrary length. A restriction in the size of the tables generated by 'lalr'
has been removed. Now it is possible to generate rather huge parsers.
- The attribute grammar tool 'ag' has been extended to generate attribute
evaluators for well-defined attribute grammars (WAGs). The program checks
grammars whether they obey this property. It is possible to access
non-local attributes and to compute attributes on a restricted form of
graphs.
- The auxiliary modules 'Errors' and 'Source' have been included into the
library of reusable modules called 'reuse'. The 'Errors' module has been
extended to support messages with a string argument. It allows to store
the messages and print them sorted by the source position. An extra module
named 'Positions' has been introduced in 'reuse', too, for the handling of
source positions.
- The program 'cg' which implements 'ast' and 'ag' accepts several input files.
Instead of one file that communicates a tree definition to 'puma' with the
fixed name 'TREE.TS' it is possible to produce several of those with different
names. This is of interest if different "views" have to be communicated.
- The extern declarations for malloc, free, and exit have been removed from
the generated C code.
- All tools do not generate # line directives by default, only upon request.
- In case of fatal errors during the execution of generated modules a user
defined exception routine can be called instead of the predefined 'exit (1)'.
Version 9202:
- The Toolbox contains a new tool for the transformation of attributed trees
called 'puma'. It is based on pattern-matching. It replaces its predecessor
'estra' and comes with documentation in English.
- The tool for abstract syntax trees 'ast' has been extended from single to
multiple inheritance. So-called "subunits" allow the implementation of one
abstract tree by several compilation units. The concept of "views" makes
it possible to derive from a common specification several abstract syntax
trees which represent subsets.
- The attribute grammar tool 'ag' has analogously been extended to process
object-oriented attribute grammars with multiple inheritance. It supports the
generation of several attribute evaluators that run one after the other.
- The error handling module for the parser generators 'lalr' and 'ell' are
independent of the parser module or the grammar. The manual for these parser
generators has been completely rewritten.
- The Modula-2 to C translator 'mtc' has a new code-generator. This brings
big efficiency improvements: 30% smaller program, 10% faster, 75% less
dynamic memory consumption. It generates ANSI C as well as K&R C.
- The sources of all tools are in ANSI C as well as in K&R C.
- All tools generate modules in the target languages C (ANSI + K&R), C++,
and Modula-2.
- The interface to the operating system has been redesigned. It allows to
switch IO operations between UNIX system calls and C library calls.
This should assure much better portability.
Compiler Construction Tool Box
==============================
Rex (Regular EXpression tool) is a scanner generator whose
specifications are based on regular expressions and arbitrary
semantic actions written in one of the target languages C or
Modula-2. As scanners sometimes have to consider the context to
unambiguously recognize a token the right context can be speci-
fied by an additional regular expression and the left context can
be handled by so-called start states. The generated scanners
automatically compute the line and column position of the tokens
and offer an efficient mechanism to normalize identifiers and
keywords to upper or lower case letters. The scanners are table-
driven and run at a speed of 180,000 to 195,000 lines per minute
on a MC 68020 processor.
Lalr is a LALR(1) parser generator accepting grammars writ-
ten in extended BNF notation which may be augmented by semantic
actions expressed by statements of the target language. The gen-
erator provides a mechanism for S-attribution, that is syn-
thesized attributes can be computed during parsing. In case of
LR-conflicts unlike other tools Lalr provides not only informa-
tion about an internal state consisting of a set of items but it
prints a derivation tree which is much more useful to analyze the
problem. Conflicts can be resolved by specifying precedence and
associativity of operators and productions. The generated parsers
include automatic error recovery, error messages, and error
repair. The parsers are table-driven and run at a speed of
560,000 lines per minute. Currently parsers can be generated in
the target languages C and Modula-2.
Ell is a LL(1) parser generator accepting the same specifi-
cation language as Lalr except that the grammars must obey the
LL(1) property. It is possible to evaluate an L-attribution
during parsing. The generated parsers include automatic error
recovery, error messages, and error repair like Lalr. The
parsers are implemented following the recursive descent method
and reach a speed of 810,000 lines per minute. The possible tar-
get languages are again C and Modula-2.
Ast - A Generator for Abstract Syntax Trees
- generates abstract data types (program modules) to handle trees
- the trees may be attributed
- besides trees graphs are handled as well
- nodes may be associated with arbitrary many attributes of arbitrary type
- specifications are based on extended context-free grammars
- common notation for concrete and abstract syntax
- as well as for attributed trees and graphs
- an extension mechanism provides single inheritance
- trees are stored as linked records
- generates efficient program modules
- generates modules in Modula-2 or C
- provides many tree operations (procedures):
- node constructors combine aggregate notation and storage management
- ascii graph reader and writer
- binary graph reader and writer
- reversal of lists
- top down and bottom up traversal
- interactive graph browser
Ag - An Attribute Evaluator Generator
- processes ordered attribute grammars (OAGs)
- processes higher order attribute grammars (HAGs)
- operates on abstract syntax
- is based on tree modules generated by Ast
- the tree structure is fully known
- terminals and nonterminals may have arbitrary many attributes
- attributes can have any target language type
- allows tree-valued attributes
- differentiates input and output attributes
- allows attributes local to rules
- allows to eliminate chain rules
- offers an extension mechanism (single inheritance)
- attributes are denoted by unique selector names
instead of nonterminal names with subscripts
- attribute computations are expressed in the target language
- attribute computations are written in a functional style
- attribute computations can call external functions
- non-functiona